Faster Text Fingerprinting
Abstract
Let s = s_1..s_n be a text (or sequence) over a finite alphabet Σ. A fingerprint in s is the set of distinct characters contained in one of its substrings. Fingerprinting a text consists of computing the set F of the fingerprints of all its substrings. A fingerprint f ∈ F admits a number of maximal locations 〈i, j〉 in s; that is, the alphabet of s_i..s_j is f, and s_{i−1} and s_{j+1}, if defined, are not in f. The set of maximal locations is L, with |L| ≤ n|Σ|. Two maximal locations 〈i, j〉 and 〈k, l〉 such that s_i..s_j = s_k..s_l are called copies, and the quotient of L by the copy relation is denoted L_C. The fastest known algorithm for computing all fingerprints in s runs in O(n + |L| log |Σ|) time. We present an O((n + |L_C|) log |Σ|) algorithm, which is almost always faster.
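The definitions in the abstract can be illustrated with a small brute-force sketch. This is not the algorithm the paper presents, and it makes no attempt to reach the stated time bounds; it only enumerates fingerprints, maximal locations, and copy classes directly from their definitions. The function names `maximal_locations` and `copy_classes` are hypothetical, chosen for this illustration, and positions are reported 1-based to match the abstract's notation.

```python
from collections import defaultdict


def maximal_locations(s):
    """Map each fingerprint (a frozenset of characters) to its maximal
    locations (i, j), using 1-based inclusive positions.

    Brute force straight from the definition; illustrative only.
    """
    n = len(s)
    locations = defaultdict(list)
    for i in range(n):
        chars = set()
        for j in range(i, n):
            chars.add(s[j])
            # If the character just left of the window is in the alphabet,
            # neither this window nor any longer one starting at i is maximal.
            if i > 0 and s[i - 1] in chars:
                break
            # Maximal on the right: no next character, or it is a new one.
            if j + 1 == n or s[j + 1] not in chars:
                locations[frozenset(chars)].append((i + 1, j + 1))
    return dict(locations)


def copy_classes(s, locations):
    """Group maximal locations by the substring they cover: two locations
    are copies when their substrings are equal. Returns the set of distinct
    substrings, one per class of the copy relation.
    """
    return {s[i - 1:j] for spans in locations.values() for (i, j) in spans}


if __name__ == "__main__":
    text = "abaab"
    locs = maximal_locations(text)
    for fingerprint, spans in locs.items():
        print(sorted(fingerprint), spans)
    print(len(copy_classes(text, locs)), "copy classes")
```

For the text `abaab`, the sketch finds three fingerprints ({a}, {b}, {a, b}) and five maximal locations. The locations 〈2, 2〉 and 〈5, 5〉 both cover the substring `b`, so they are copies, and the quotient by the copy relation has four classes.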
Similar resources
A Faster Query Algorithm for the Text Fingerprinting Problem
Article history: Received 15 February 2009 Revised 14 January 2011 Available online 7 April 2011
Plagiarism checker for Persian (PCP) texts using hash-based tree representative fingerprinting
Plagiarism detection, which protects authors' rights, is one of the critical problems in the field of text mining and interests many researchers. The issue is taken seriously in higher academic institutions. Existing language-free tools do not yield reliable results because they ignore the special features of each language. Considering the paucit...
Information Hiding for Text by Paraphrasing
Digital fingerprinting is receiving growing attention as a technology for resolving copyright problems. Previously, researchers have been interested only in image-based digital fingerprinting, where secret information is hidden in images, and text has not been the main target of information hiding. In this paper, we propose an information hiding method for text. Our information hiding method is bas...
Digital Fingerprinting Based on Keystroke Dynamics
Digital fingerprinting is an important but still challenging aspect of network forensics. This paper introduces an effective way to identify an attacker based on a strong behavioral biometric. We introduce a new passive digital fingerprinting technique based on keystroke dynamics biometrics. The technique is based on free text detection and analysis of keystroke dynamics. It allows building a b...
Fingerprinting of Digital Information—Introduction and some Preliminary Results
Coding methods for fingerprinting digital information are considered, with the aim of deterring users from copyright violation. A general model for discrete fingerprinting is presented, along with a simple embedding method for text documents. Different types of attacks are discussed, including attacks from colluding pirates. Some preliminary results are derived for random fingerprinting codes. ...